ABSTRACT 

A real-time problem is finding the top-k profitable 
products in the existing market, that is, the best 
products such that a new product is not dominated by 
any other product in the market. Dominance and skyline 
analysis are well suited to decision-making 
applications. Existing work focuses on helping 
customers analyze a set of products from a pool of 
given products. Given the set of products in the 
existing market, we want to identify possible new 
products that are not dominated by the others in the 
market. Two problem instances of finding top-k 
preferable products are studied. In the first problem 
instance, the prices of these products are set so that 
the total income is maximized. 

Keywords: prediction engine, top-k product 

1.Introduction: 

Skyline: 

Dominance and skyline analysis is well recognized in 
prediction engines that weigh cost against product 
quality. A package that is not dominated by any other 
package is said to be a skyline package, or to be in 
the skyline. The packages in the skyline are the best 
possible tradeoffs between the two factors in 
question. [2] 
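
The definition above can be illustrated with a minimal sketch. It assumes each package has two numeric attributes where lower is better (here a distance and a price); the package names and values are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

# (name, distance, price); for both attributes a lower value is assumed to be better.
Package = Tuple[str, float, float]


def dominates(a: Package, b: Package) -> bool:
    """a dominates b if a is no worse on both attributes and strictly better on one."""
    no_worse = a[1] <= b[1] and a[2] <= b[2]
    strictly_better = a[1] < b[1] or a[2] < b[2]
    return no_worse and strictly_better


def skyline(packages: List[Package]) -> List[Package]:
    """Return the packages that are not dominated by any other package."""
    return [p for p in packages if not any(dominates(q, p) for q in packages)]


# Hypothetical data for illustration only.
packages = [("A", 7.0, 200), ("B", 4.0, 350), ("C", 5.0, 400), ("D", 1.0, 750)]
print([name for name, _, _ in skyline(packages)])  # ['A', 'B', 'D']
```

Package C is left out because B is both closer and cheaper. This quadratic scan is only meant to make the definition concrete; it is not an efficient skyline algorithm.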

The skyline operator is important for several 
applications involving decision making. Skylines are 
related to several other well-known problems, such as 
top-k queries and nearest neighbor search. 


1.1.Existing System 

A naive way to find the top-k profitable products is 
to enumerate all possible subsets of size k from the 
available set, compute the sum of the profits of each 
subset, and finally select the subset with the 
greatest sum. 

A naive way to find the top-k popular products is 
similar to that of the first instance: first find all 
possible subsets of size k from the available set, and 
then choose the subset with the greatest number of 
customers. [3] 
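
The naive enumeration described above can be sketched as follows. This is a simplification for illustration: it assumes each candidate product already has a fixed profit, whereas in the actual problem the profit depends on the price chosen. The product names and profit values are hypothetical.

```python
from itertools import combinations
from typing import Dict, List


def naive_top_k(profits: Dict[str, float], k: int) -> List[str]:
    """Enumerate every size-k subset and keep the one with the greatest total profit."""
    best_subset = max(
        combinations(profits.items(), k),
        key=lambda subset: sum(profit for _, profit in subset),
    )
    return sorted(name for name, _ in best_subset)


# Hypothetical profits for illustration only.
profits = {"Q1": 200, "Q2": 150, "Q3": 50, "Q4": 120}
print(naive_top_k(profits, 2))  # ['Q1', 'Q2']
```

The same loop applies to the second instance if the key function counts the customers attracted by a subset instead of summing profits.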

Existing Algorithm: 

The existing algorithms find the top-k totally and 
partially unexplained sequences and classes. 

For ease of presentation, we assume |f.obs| = 1 for 
every OID f in an observation sequence (this makes 
the algorithms much more concise; generalization to 
the case of multiple action symbols per OID is 
straightforward). [4] 

Given an observation sequence v = (f1, ..., fn), we 
use v(i, j) (1 ≤ i ≤ j ≤ n) to denote the sequence S = 
((fi, si), ..., (fj, sj)), where sk is the only 
element in fk.obs, i ≤ k ≤ j. 
Disadvantages of Existing System: 

The approach used to find the top-k profitable products 
(first problem instance) is not scalable because there 
is an exponential number of possible subsets. 

The procedure used to find the top-k popular products 
(second problem instance) is also not scalable because 
it also involves enumerating all possible subsets, of 
which there are exponentially many. 
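
To give a feel for the scalability problem, the short sketch below counts the size-k subsets that the naive approach would have to examine. The count is the binomial coefficient C(n, k); the values of n and k are arbitrary illustrations.

```python
import math


def subset_count(n: int, k: int) -> int:
    """Number of size-k subsets of an n-element candidate set, i.e. C(n, k)."""
    return math.comb(n, k)


for n, k in [(10, 3), (20, 5), (50, 10)]:
    print(n, k, subset_count(n, k))
```

Even for a few dozen candidates the number of subsets runs into the billions, which is why enumeration is impractical.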

1.2.Proposed System 

To find top k profitable products: 

The proposed system uses a dynamic programming 
approach, which finds an optimal solution when 
there are two attributes to be considered. Here we 
use the Find Optimal Incremental Property 
Algorithm, in which we try to discover the quasi- 
dominance of the products. In addition, our system 
performs skyline checks on the available data. 
Based on an optimized check between these two 
methods, the profitable products are identified. 


In this module, we use the attributes as the main 
criteria to define and decide the top-k products. For 
preferable products, we take the summation of the 
user ratings as the criterion to identify the best 
products. For profitable products, our system uses 
the car cost, the duration of the car, and the user 
rating. Based on the combined summarization and 
summation, we identify the profitable products in 
the category. 
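
The rating-based criterion for preferable products can be sketched as below. This is a minimal illustration, assuming ratings arrive as (product, rating) pairs; the product names and rating values are hypothetical, and the treatment of cost and duration for profitable products is not covered.

```python
import heapq
from collections import defaultdict
from typing import Iterable, List, Tuple


def top_k_by_rating(ratings: Iterable[Tuple[str, float]], k: int) -> List[Tuple[str, float]]:
    """Sum the user ratings per product and return the k products with the largest sums."""
    totals = defaultdict(float)
    for product, rating in ratings:
        totals[product] += rating
    return heapq.nlargest(k, totals.items(), key=lambda item: item[1])


# Hypothetical ratings for illustration only.
ratings = [("car_a", 4), ("car_b", 5), ("car_a", 3), ("car_c", 2), ("car_b", 1)]
print(top_k_by_rating(ratings, 2))  # [('car_a', 7.0), ('car_b', 6.0)]
```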

The adaptive pulling strategy prioritizes access 
between the two relations based on the observed 
data. [3] The main idea behind this approach is to 
read tuples from a relation only if there is evidence 
that the new tuples will help satisfy the 
termination condition. Intuitively, this 
prioritization helps the algorithm terminate sooner, 
thus improving its performance. The popular 
products are then determined by skyline processing. 


2.Proposed System Architecture Diagram 
In this architecture, the user inputs a rating for 
each product, and other product details are also 
entered into the data. The prediction engine gathers 
the user ratings and the other product details, 
analyzes both using prediction, and then separates 
the dominance products from the suppressed 
products. A dominance product is a high-level 
product in the market, and a suppressed product is a 
low-level product. From the dominance products, the 
system finds the top-k products and analyzes the 
quality and cost of each product. 

Advantages of Proposed System 

For the first problem of finding top-k profitable 
products, a dynamic programming approach is 
proposed that can find an optimal solution when 
there are two attributes to be considered. 

An incremental approach is used to handle dynamic 
datasets that change over time. 

3. Literature Survey: 

TITLE: Mining top-K frequent item sets from 
data streams 

This work addresses the problem of mining top-k 
frequent item sets in data streams. The authors 
introduce a method based on the Chernoff bound 
with a guarantee on the output quality and a 
bound on the memory usage. 

The authors also propose an algorithm based on the 
Lossy Counting Algorithm. In most of the 
experiments on the two proposed algorithms, they 
obtain perfect solutions [5], and the memory space 
occupied by the algorithms is very small. 
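
For readers unfamiliar with Lossy Counting, the sketch below shows the classic single-item version of the algorithm, not the item-set variant proposed in the surveyed paper. The stream, the error parameter `epsilon`, and the support threshold are hypothetical illustrations.

```python
import math
from typing import Dict, Hashable, Iterable, List, Tuple


def lossy_count(stream: Iterable[Hashable], epsilon: float) -> Tuple[Dict, int]:
    """Classic Lossy Counting: approximate item counts in bounded memory.

    Each entry maps item -> (count, delta), where delta is the maximum
    number of occurrences that may have been missed before the item was tracked.
    """
    width = math.ceil(1 / epsilon)  # bucket width
    counts: Dict[Hashable, Tuple[int, int]] = {}
    n = 0
    for item in stream:
        n += 1
        bucket = math.ceil(n / width)  # id of the current bucket
        if item in counts:
            f, delta = counts[item]
            counts[item] = (f + 1, delta)
        else:
            counts[item] = (1, bucket - 1)
        if n % width == 0:
            # At each bucket boundary, prune entries that cannot be frequent.
            counts = {x: (f, d) for x, (f, d) in counts.items() if f + d > bucket}
    return counts, n


def frequent_items(counts: Dict, n: int, support: float, epsilon: float) -> List:
    """Report items whose tracked count is at least (support - epsilon) * n."""
    return sorted(x for x, (f, _) in counts.items() if f >= (support - epsilon) * n)


stream = ["a", "b", "a", "c", "a", "b", "a", "d", "a", "b", "a", "e"]
counts, n = lossy_count(stream, epsilon=0.25)
print(frequent_items(counts, n, support=0.5, epsilon=0.25))  # ['a']
```

Pruning at bucket boundaries is what keeps memory small: infrequent items are dropped regularly, at the cost of undercounting any item by at most epsilon times the stream length.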

MERITS 

This paper uses the “Chernoff bound”, which is a 
bound on the probability that a random variable 
deviates by a certain amount from its expectation. 
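
For reference, one common multiplicative form of the Chernoff bound is given below. It is stated under the standard assumption that X is a sum of independent 0/1 random variables with expectation \mu; the surveyed paper may use a different variant.

```latex
\Pr\bigl[X \ge (1+\delta)\mu\bigr] \le \exp\!\left(-\frac{\delta^{2}\mu}{3}\right),
\qquad
\Pr\bigl[X \le (1-\delta)\mu\bigr] \le \exp\!\left(-\frac{\delta^{2}\mu}{2}\right),
\qquad 0 < \delta < 1 .
```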

This algorithm clearly focusses on the memory usage; 
“unpromising item sets will be pruned regularly to 
keep the memory usage low” is the basic underlying 
concept of this paper. 

DEMERITS 

The authors did not focus on the performance of the 
system or the accuracy of the data retrieved. 

TITLE: Mining Top-K Patterns from Binary 
Datasets in presence of Noise 

This work is directed at the discovery of patterns in 
binary datasets, which arises in many applications, 
e.g. electronic commerce, TCP/IP networking, and 
Web usage logging. It considers factors such as 
overlapping vs. non-overlapping patterns, the 
presence of noise, and the extraction of only the 
most important patterns. 

In this paper, the authors formalize the problem of 
discovering the top-k patterns from binary datasets 
in the presence of noise as the minimization of a 
novel cost function. 

According to the Minimum Description Length 
principle, the proposed cost function favors succinct 
pattern sets that may approximately describe the 
input data. [6] 

MERITS 

This work determines the exact top-level data from 
binary datasets even in the presence of issues with 
the data. 

The authors use the PANDA algorithm to identify the 
level of noise, together with a rectification 
technique for the noisy data. 

DEMERITS 

The authors did not clarify how the level of noise 
identification is specified, which is a major 
drawback of this system. 

TITLE: Mining and Representing User Interests: 
the Case of Tagging Practices 

In this paper, the authors provide a novel approach 
for clustering user-centric interests by analyzing the 
tagging practices of individual users. 

The approach uses FCA (Formal Concept Analysis) and 
a significance measure based on the weight of the 
tags in the given data set. 

The concept analysis technique makes it easy to mine 
common tags with respect to users in the data set. [7] 
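
The two basic derivation operators of Formal Concept Analysis can be sketched as below. This is a minimal illustration over a hypothetical user-tag table; it leaves out the significance measure and tag weights used in the surveyed paper.

```python
from typing import Dict, Iterable, Set


def common_tags(context: Dict[str, Set[str]], users: Iterable[str]) -> Set[str]:
    """Intent: the tags shared by every user in `users`."""
    tag_sets = [context[u] for u in users]
    if not tag_sets:
        # By convention, the empty set of users shares all tags.
        return set().union(*context.values())
    return set.intersection(*tag_sets)


def users_with_tags(context: Dict[str, Set[str]], tags: Set[str]) -> Set[str]:
    """Extent: the users who have used every tag in `tags`."""
    return {user for user, user_tags in context.items() if tags <= user_tags}


def is_formal_concept(context: Dict[str, Set[str]], users: Set[str], tags: Set[str]) -> bool:
    """(users, tags) is a formal concept when each set determines the other."""
    return common_tags(context, users) == tags and users_with_tags(context, tags) == users


# Hypothetical tagging data for illustration only.
context = {
    "u1": {"python", "ml"},
    "u2": {"python", "web"},
    "u3": {"python", "ml", "web"},
}
print(sorted(common_tags(context, ["u1", "u3"])))  # ['ml', 'python']
```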

MERITS: 

The approach can be used to suggest new social 
relationships within a small-size group based on the 
users’ interests. 

It is easy to aggregate user interests from multiple 
sources. 

DEMERITS: 

It is not straightforward to build general-level 
information for the given data. 

Large community-level information cannot be built 
by adopting these approaches. 

4.METHODOLOGY 

Find Optimal Incremental Property Algorithm 

> Input: A set Qi-1 (= {q1, q2, ..., qi-1}), a tuple qi 
in Q0, and the optimal price assignment vector vi-1 
of Qi-1 
> Output: The optimal price assignment vector vi of 
Qi with the h-dominance constraint 

The relational database could contain too many 
disconnected components, in which case our link analysis 
approach is almost useless. 

■ Deletion of an existing package 

■ Insertion of a new package 

■ Modifying the attribute values of an existing 
package 
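
The three update types listed above can be handled without recomputing the skyline from scratch on every change. The sketch below is one simple way to do this and is not the paper's algorithm. It assumes two attributes per package where lower is better; the class name `SkylineIndex` and all package values are hypothetical.

```python
from typing import Dict, Set, Tuple

Attrs = Tuple[float, float]  # two attributes; lower is assumed better for both


def dominates(a: Attrs, b: Attrs) -> bool:
    """a dominates b if a is no worse on both attributes and strictly better on one."""
    return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])


class SkylineIndex:
    """Keeps the skyline of a package set up to date under updates."""

    def __init__(self, packages: Dict[str, Attrs]) -> None:
        self.packages = dict(packages)
        self.skyline: Set[str] = {
            name for name, attrs in self.packages.items()
            if not any(dominates(other, attrs) for other in self.packages.values())
        }

    def insert(self, name: str, attrs: Attrs) -> None:
        self.packages[name] = attrs
        # If a skyline member dominates the new package, the skyline is unchanged.
        if any(dominates(self.packages[s], attrs) for s in self.skyline):
            return
        # Otherwise the new package joins and evicts the members it dominates.
        self.skyline = {s for s in self.skyline if not dominates(attrs, self.packages[s])}
        self.skyline.add(name)

    def delete(self, name: str) -> None:
        removed = self.packages.pop(name)
        if name not in self.skyline:
            return
        self.skyline.discard(name)
        # Only packages that the removed one dominated can be promoted.
        for other, attrs in self.packages.items():
            if dominates(removed, attrs) and not any(
                dominates(p, attrs) for p in self.packages.values()
            ):
                self.skyline.add(other)

    def modify(self, name: str, attrs: Attrs) -> None:
        # A modification is treated as a deletion followed by an insertion.
        self.delete(name)
        self.insert(name, attrs)


index = SkylineIndex({"P1": (7.0, 200), "P2": (4.0, 350), "P3": (5.0, 400)})
index.insert("P4", (3.0, 300))
print(sorted(index.skyline))  # ['P1', 'P4']
```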

These methods are used to analyze the data in the 
database and identify the top-k preferable and 
profitable products. 

■ Authentication Module 

■ Product Details Screen 


To find top k popular products: 


Proposed System - Example 1 


Profitable price with one package 

o If the price is $100, there is no profit 
o If the price is $400, the profit is $300, but the package is dominated by P2 

o If the price is $300, the profit is $200, and the package is not dominated by any other package 


Package    Destination to beach 
           7.0 units 
           4.0 units 
           1.0 unit 
           3.0 units 


Package    Destination to beach    Cost    Price 
Q1         5.0 units               100     ? 
           4.5 units               200     ? 
El         0.5 units               400     ? 
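
The reasoning in Example 1 can be reproduced with a small sketch. The distances of the existing packages and the distance and cost of the new package follow the tables above, but the prices of the existing packages are hypothetical assumptions chosen to be consistent with the three cases in the example; the candidate price list is also an assumption.

```python
from typing import List, Optional, Tuple


def is_dominated(distance: float, price: float, existing: List[Tuple[float, float]]) -> bool:
    """True if some existing (distance, price) package is no worse on both and better on one."""
    return any(
        d <= distance and p <= price and (d < distance or p < price)
        for d, p in existing
    )


def best_price(
    distance: float,
    cost: float,
    candidate_prices: List[float],
    existing: List[Tuple[float, float]],
) -> Optional[Tuple[float, float]]:
    """Pick the candidate price with the highest profit that keeps the package undominated."""
    feasible = [p for p in candidate_prices if not is_dominated(distance, p, existing)]
    if not feasible:
        return None
    price = max(feasible)  # profit = price - cost, so the highest feasible price wins
    return price, price - cost


# Distances from the table above; the prices 200, 350, 750, 600 are assumed.
existing = [(7.0, 200), (4.0, 350), (1.0, 750), (3.0, 600)]
print(best_price(5.0, 100, [100, 300, 400], existing))  # (300, 200)
```

With these assumed prices, a price of $400 is rejected because the package at 4.0 units is both closer and cheaper, while $300 yields a profit of $200 and leaves the new package undominated.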


6. Conclusion 

This paper showed how a user can find the top-k 
profitable and preferable products that are not 
dominated by any other product existing in the 
market. The work proposes choosing the best product 
to obtain the maximum profit: although multiple 
decisions arise when selecting a product, the 
approach suggests the best product for maximum 
profit. All the problems of finding top-k profitable 
products are solved. Synthetic data was used for 
testing, and the results obtained are practically and 
theoretically accurate. The approach is also suitable 
for real-time data sets (i.e. companies' real-time 
trading records connected to a server). 